Parallel Processor Programming: A Practical Introduction: The Great Divergence: The Trajectory of Computing Performance

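The Great Divergence marks a tectonic shift in microprocessor history. Between 2001 and 2009, the performance trajectories of central processing units (CPUs) and graphics processing units (GPUs) split apart, opening a large capability gap. As traditional CPUs hit the power wall, where raising the clock frequency generated more heat than could be dissipated, GPUs drew on their large consumer base in the gaming market to fund a shift toward extreme parallelism.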

Key Turning Point

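By 2003 the gap had begun to widen. CPUs stayed focused on sequential logic and low latency, while GPUs concentrated their transistor budget on arithmetic logic units (ALUs). As a result, GPU throughput moved from the gigaflops (GFLOPS) range toward the teraflops (TFLOPS) range, while CPUs followed a flatter growth curve.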

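As of 2009, a high-end Intel Core i7-960 delivered roughly 70 GFLOPS, while the NVIDIA GTX 280 reached nearly 933 GFLOPS. This was not merely a speedup; it was a fundamental restructuring of how computation is done, one that emphasizes throughput over the speed of any single instruction.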
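The size of that gap can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the two peak figures quoted in the lesson, and the names `throughput_ratio` and `gflops_to_tflops` are hypothetical helpers introduced here, not part of any library.

```python
# Peak throughput figures as quoted in the lesson text, in GFLOPS.
# These are the lesson's approximate numbers, not measured values.
CPU_GFLOPS = 70.0   # Intel Core i7-960
GPU_GFLOPS = 933.0  # NVIDIA GTX 280


def throughput_ratio(gpu_gflops: float, cpu_gflops: float) -> float:
    """Return how many times higher the GPU's peak throughput is."""
    if cpu_gflops <= 0:
        raise ValueError("cpu_gflops must be positive")
    return gpu_gflops / cpu_gflops


def gflops_to_tflops(gflops: float) -> float:
    """Convert GFLOPS (1e9 FLOP/s) to TFLOPS (1e12 FLOP/s)."""
    return gflops / 1000.0


if __name__ == "__main__":
    ratio = throughput_ratio(GPU_GFLOPS, CPU_GFLOPS)
    print(f"GPU/CPU peak ratio: {ratio:.1f}x")
    print(f"GTX 280 peak: {gflops_to_tflops(GPU_GFLOPS):.3f} TFLOPS")
```

With the quoted figures the ratio comes out at about 13x, and the GTX 280 sits just under the 1 TFLOPS mark. Both numbers are theoretical peaks, so they describe an upper bound rather than what a real workload would achieve.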

QUESTION 1

What primary constraint led to the 'Power Wall' for traditional CPUs?

The lack of available memory in the early 2000s.

Thermal and power limitations when increasing clock speeds.

A shortage of transistors on the silicon die.

The transition from 32-bit to 64-bit architectures.

QUESTION 2

According to the Great Divergence, which industry provided the economic engine for GPU R&D?

The Financial High-Frequency Trading market.

The Oil and Gas seismic exploration industry.

The Video Game industry.

The Cryptocurrency mining industry.

QUESTION 3

By 2009, how did the peak performance of an NVIDIA GTX 280 compare to an Intel Core i7-960?

They were roughly equal in throughput.

The CPU was twice as fast as the GPU.

The GPU was roughly an order of magnitude higher (~13x).

The GPU was 100x faster than the CPU.

QUESTION 4

GPUs achieve higher throughput by dedicating more transistors to which component?

Large Level-3 Caches.

Complex Branch Prediction logic.

Arithmetic Logic Units (ALUs).

Instruction Decoders.

QUESTION 5

What is the correct unit for measuring one trillion floating-point operations per second?

GFLOPS.

Teraflops.

Petaflops.

Megaflops.